PUC Minas and IRISA at Multimodal Person Discovery
Abstract
This paper describes the systems developed by PUC Minas and IRISA for the person discovery task at MediaEval 2016. We adopt a graph-based representation and investigate two tag-propagation approaches that associate overlays co-occurring with some speaking faces with other visually or audio-visually similar speaking faces. Given a video, we first build a graph from the detected speaking faces (nodes) and their audio-visual similarities (edges). Each node is associated with its co-occurring overlays (tags) when they exist. We then consider two tag-propagation approaches, based respectively on a random walk strategy and on Kruskal’s algorithm.
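As an illustration only, the Kruskal-based idea can be sketched as follows. The abstract does not give the actual procedure, so everything here is an assumption: the function name `propagate_tags_kruskal`, the edge format `(similarity, node_a, node_b)`, and in particular the rule that two components carrying different tags are never merged. Edges are visited from most to least similar, as in Kruskal's construction of a maximum spanning forest, and an untagged component inherits the tag of the tagged component it joins.

```python
from typing import Dict, Hashable, List, Optional, Tuple

# Hypothetical edge format: (similarity, node_a, node_b)
Edge = Tuple[float, Hashable, Hashable]


def propagate_tags_kruskal(
    tags: Dict[Hashable, Optional[str]],
    edges: List[Edge],
) -> Dict[Hashable, Optional[str]]:
    """Spread known tags over a similarity graph in Kruskal order.

    `tags` maps every speaking-face node to its overlay tag, or None.
    """
    parent = {node: node for node in tags}
    comp_tag = dict(tags)  # tag carried by each component root

    def find(node):
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Visit edges from most to least similar.
    for _sim, a, b in sorted(edges, key=lambda e: e[0], reverse=True):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # already in the same component
        ta, tb = comp_tag[ra], comp_tag[rb]
        if ta is not None and tb is not None and ta != tb:
            # Assumption: components with conflicting tags stay separate.
            continue
        parent[rb] = ra
        comp_tag[ra] = ta if ta is not None else tb

    return {node: comp_tag[find(node)] for node in tags}


if __name__ == "__main__":
    example_tags = {"f1": "alice", "f2": None, "f3": None, "f4": "bob"}
    example_edges = [
        (0.9, "f1", "f2"),
        (0.8, "f3", "f4"),
        (0.4, "f2", "f3"),
    ]
    print(propagate_tags_kruskal(example_tags, example_edges))
```

In this toy graph, `f2` joins `f1` and `f3` joins `f4` through their strongest edges, and the weaker `f2`-`f3` edge is skipped because it would connect two differently tagged components. The random walk variant is not shown.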
Similar resources
SSIG and IRISA at Multimodal Person Discovery
This paper describes our approach and results in the multimodal person discovery in broadcast TV task at MediaEval 2015. We investigate two distinct aspects of multimodal person discovery. One refers to face clusters, which are considered to propagate names associated with faces in one shot to other faces that probably belong to the same person. The face clustering approach consists in calculatin...
Benchmarking multimedia technologies with the CAMOMILE platform: the case of Multimodal Person Discovery at MediaEval 2015
In this paper, we claim that the CAMOMILE collaborative annotation platform (developed in the framework of the eponymous CHIST-ERA project) eases the organization of multimedia technology benchmarks, automating most of the campaign technical workflow and enabling collaborative (hence faster and cheaper) annotation of the evaluation data. This is demonstrated through the successful organization ...
The MODIS software for word like motif discovery and its use for zero resource audio summarization
MODIS is a free audio motif discovery software developed at IRISA Rennes. Motif discovery is the task of discovering and collecting occurrences of repeating patterns in the absence of prior knowledge or training material. In the case of speech, those motifs could be words, since MODIS is tolerant to motif variability. The algorithm implementation allows to process large audio streams at a reason...
GTM-UVigo System for Multimodal Person Discovery in Broadcast TV Task at MediaEval 2016
In this paper, we present the system developed by the GTM-UVigo team for the Multimedia Person Discovery in Broadcast TV task at MediaEval 2016. The proposed approach consists in a novel strategy for person discovery which is not based on speaker and face diarisation as in previous works. In this system, the task is approached as a person recognition problem: there is an enrolment stage, where the v...
Tokyo Tech at MediaEval 2016 Multimodal Person Discovery in Broadcast TV task
This paper describes our diarization system for the Multimodal Person Discovery in Broadcast TV task of the MediaEval 2016 Benchmark evaluation campaign [1]. The goal of this task is to name speakers who appear and speak simultaneously in the video, without prior knowledge. Our diarization system relies on a face diarization approach. We extract deep features from a face every 0.5 secon...